Issue information:
  • Year: 2024
  • Volume: 20
  • Issue: 4
  • Pages: 8-22
Engagement:
  • Citations: 0
  • Views: 10
  • Downloads: 0
Abstract:

With the exponential growth of unstructured data on the Web and social networks, extracting relevant information from multiple sources has become increasingly challenging, creating a need for automated summarization systems. However, developing machine learning-based summarization systems depends largely on datasets, which must be evaluated to determine their usefulness in retrieving data. In most cases, these datasets are summarized with human involvement. This approach is inadequate for some low-resource languages, which makes summarization a daunting task. To address this, this paper proposes a method for developing the first abstractive text summarization corpus with human evaluation, together with an automated summarization model, for the Sorani Kurdish language. The researchers compiled various documents from information available on the Web (Rudaw), and the resulting corpus was released publicly. A customized and simplified version of the mT5-base transformer was then developed to evaluate the corpus. The model's performance was assessed using ROUGE-1, ROUGE-2, ROUGE-L, n-gram novelty, and manual evaluation, and the generated summaries are close to the reference summaries on all of these criteria. This Sorani Kurdish corpus and automated summarization model have the potential to pave the way for future studies, facilitating the development of improved summarization systems for low-resource languages.
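
The ROUGE-N and n-gram novelty criteria named in this abstract can be illustrated with a minimal sketch. It assumes plain whitespace tokenization (a real Sorani Kurdish pipeline would likely need a proper tokenizer), it does not cover ROUGE-L, and the function names `rouge_n` and `ngram_novelty` are hypothetical, not taken from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rouge_n(candidate, reference, n=1):
    """ROUGE-N precision, recall and F1 from clipped n-gram overlap.

    Assumption: tokens are obtained by splitting on whitespace.
    """
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    overlap = sum((cand & ref).values())  # per-n-gram minimum of the two counts
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def ngram_novelty(summary, source, n=1):
    """Fraction of summary n-grams that never occur in the source text."""
    summ = ngrams(summary.split(), n)
    src = set(ngrams(source.split(), n))
    if not summ:
        return 0.0
    return sum(1 for g in summ if g not in src) / len(summ)
```

High ROUGE with low novelty suggests a summary that mostly copies its source, while an abstractive model is expected to show some novelty without losing overlap with the reference.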

Source: Scientific Information Database (SID) - Trusted Source for Research and Academic Resources
References: 0
Issue information:
  • Year: 2022
  • Volume: 10
  • Issue: 1 (37)
  • Pages: 68-79
Engagement:
  • Citations: 0
  • Views: 86
  • Downloads: 0
Abstract:

Before the advent of the World Wide Web, lack of information was a problem. Today, the web confronts us with an explosive amount of information in every area of search. This excess information is troublesome and hinders quick, correct decisions; this is the problem of information overload. Multi-document summarization is an important solution to this problem: it quickly produces a brief summary containing the most important information from a set of documents. This summary should preserve the main concepts of the documents. When the input documents belong to a specific domain, such as medicine or law, summarization faces additional challenges. Domain-oriented summarization methods use characteristics specific to that domain to generate summaries. This paper introduces the purpose of multi-document summarization systems and discusses domain-oriented approaches. Researchers have proposed various methods for multi-document summarization. This survey reviews the categorizations that authors have made of multi-document summarization methods. We also group the methods into six categories: machine learning, clustering, graph, Latent Dirichlet Allocation (LDA), optimization, and deep learning. We review the methods presented in each group and compare the groups' advantages and disadvantages. Finally, we discuss the standard datasets used in this field, evaluation measures, challenges, and recommendations.

References: 0
Issue information:
  • Year: 2024
  • Volume: 2
  • Issue: 1
  • Pages: 70-75
Engagement:
  • Citations: 0
  • Views: 1
  • Downloads: 0
Abstract:

With the increase in textual data generated on the internet and the limited time individuals have for reading, the need for automatic text summarization is greater than ever. One application of summarization is title generation. The goal of this study, which falls within the field of digital humanities and interdisciplinary studies, is to provide a framework for title generation through extractive and abstractive summarization methods, focusing specifically on chapters of the Qur'an. For extractive summarization, eleven different methods were examined, some of which are novel. For the abstractive part and title generation, several models were trained in order to select the most effective one. The Persian translation of the Qur'an is used as the primary source, and a dataset was created from the first ten parts (juz) of the Qur'an, including extractive summaries, abstractive summaries, and titles for various sections of the chapters. The results indicate that the titles generated through summarization are close to human-generated titles, based on BERTScore, R-1, R-2, and R-L values of 21.03, 6.85, 20.73, and 52.51, respectively. It is important to note in the evaluation that a document does not have a single fixed title; multiple titles may be valid. In human evaluation, the average score of the proposed approach is 0.59, while the best result from other approaches is 0.44.

References: 0
Issue information:
  • Year: 1381 (Solar Hijri)
  • Volume: 4
  • Issue: 3 (serial no. 15)
  • Pages: 35-41
Engagement:
  • Citations: 0
  • Views: 1838
  • Downloads: 348
Abstract:

Today, because of the growing volume of information on various topics, information extraction systems are of particular importance; more important still is a system that can present the user with an abstract or summary of the retrieved information. This paper presents an approach to generating a summary from multiple documents: by using the information in several articles or texts, extracting their key points, and establishing relationships among them, a single abstract can be produced and made available to the user. A multi-document summarization system differs from a single-document summarizer, and this difference relates to factors such as compression, speed, non-redundancy, readability, and the coherence of the generated summary's sentences with one another. The aim of this paper is to present a model for building such a system.

References: 0
Issue information:
  • Year: 1394 (Solar Hijri)
  • Volume: 5
  • Issue: 2
  • Pages: 483-508
Engagement:
  • Citations: 0
  • Views: 1371
  • Downloads: 470
Abstract:

Please refer to the full text (PDF) to view the abstract.

References: 2
Issue information:
  • Year: 2010
  • Volume: 7
  • Issue: 3
  • Pages: 15-32
Engagement:
  • Citations: 0
  • Views: 377
  • Downloads: 0
Abstract:

Due to the explosive growth of the World Wide Web, automatic text summarization has become an essential tool for web users. In this paper, we present a novel approach for creating text summaries. Using fuzzy logic and WordNet, our model extracts the most relevant sentences from an original document. The approach applies fuzzy measures and inference to the textual information extracted from the document to find the most significant sentences. Experimental results show that the proposed approach extracts the most relevant sentences when compared with other commercially available text summarizers. Text pre-processing based on WordNet and fuzzy analysis is the main part of our work.

References: 0
Authors: NAZARI N. | Mahdavi M. A.

Issue information:
  • Year: 2019
  • Volume: 7
  • Issue: 1
  • Pages: 121-135
Engagement:
  • Citations: 0
  • Views: 200
  • Downloads: 0
Abstract:

A Survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. Sifting through such a massive amount of data to extract useful information is a significant undertaking and requires an automatic mechanism to help with the existing repository of information. Text summarization systems intend to assist with content reduction, keeping the relevant information and filtering out the non-relevant parts of the text. In terms of input, there are two fundamental approaches among text summarization systems. The first summarizes a single document: the system takes one document as input and produces a summary version as output. The alternative takes several documents as input and produces a single summary document as output. In terms of output, summarization systems are also categorized into two major types. One approach extracts exact sentences from the original document to build the summary. The alternative is a more complex approach in which the rendered text is a rephrased version of the original document. This paper offers an in-depth introduction to automatic text summarization. We also mention some techniques for evaluating the quality of automatic text summarization.

References: 0
Authors: MIRSHOJAEI SEYED HOSSEIN | MASOUMI BEHROOZ

Issue information:
  • Year: 2015
  • Volume: 8
  • Issue: 2
  • Pages: 19-24
Engagement:
  • Citations: 0
  • Views: 265
  • Downloads: 0
Abstract:

Today, with the rapid growth of the World Wide Web and the creation of Internet sites and online text resources, text summarization has attracted considerable attention from researchers. Extractive text summarization is an important summarization method that consists of selecting the top representative sentences from the input document. For documents with a large volume of data, extractive summarization appears to be an intractable problem. Therefore, meta-heuristic techniques are applied as a solution. In this paper, we use the Cuckoo Search Optimization Algorithm (CSOA) to improve the performance of extractive summarization. The proposed approach is tested on the DUC 2002 standard documents and evaluated with the ROUGE evaluation software. The results indicate better performance of the proposed method compared with other similar techniques.
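
To make "selecting the top representative sentences" concrete, the following is a minimal word-frequency baseline. It is not the Cuckoo Search approach from this abstract: it assumes a simple regex sentence splitter and scores each sentence independently, whereas a meta-heuristic would search over sentence subsets against an objective. The function name `extractive_summary` is hypothetical.

```python
import re
from collections import Counter


def extractive_summary(text, k=2):
    """Pick the k highest-scoring sentences, returned in original order.

    Assumption: sentences end with '.', '!' or '?' followed by whitespace.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"\w+", sentence.lower())
        # Average word frequency, so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:k])  # restore document order
    return [sentences[i] for i in chosen]
```

Scoring sentences one at a time ignores redundancy between the chosen sentences, which is one reason subset-level optimization is attractive for larger documents.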

References: 0
Issue information:
  • Year: 1396 (Solar Hijri)
  • Volume: 7
  • Issue: 4
  • Pages: 315-325
Engagement:
  • Citations: 0
  • Views: 744
  • Downloads: 146
Abstract:

One of the important areas in passive defense is threat detection and warning. A widely used detection method is the analysis of video data to identify unknown targets and raise alerts. Video summarization methods have been proposed to make this analysis fast and highly accurate. Moreover, over the past years, digital video production has led to exponential growth in video content. To make this large volume of video more usable, much research has been carried out, and video summarization has been proposed for quickly browsing large video collections and for helping users grasp video content quickly. In video summarization, representative frames are selected from each scene to give a visual overview of the entire video. Recently, methods based on sparse formulations have summarized video data to a much greater extent than other methods. This paper treats video summarization as a sparse dictionary selection problem. To this end, a new method based on sparse coding substantially improves the summarization of video data compared with other video summarization methods, whether sparse or otherwise. The method is based on solving the optimization problem using soft thresholding, which has lower complexity than recently proposed methods; this can be seen by comparing the complexity of the proposed method with that of common recent methods. Finally, experimental results on benchmark datasets with ground truth, compared against state-of-the-art methods, support our claim that the proposed method improves the degree of summarization.
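
The soft-thresholding step mentioned in this abstract can be sketched in isolation. The snippet below shows only the standard operator (the proximal operator of the L1 norm), which shrinks each coefficient toward zero and sets small ones exactly to zero. The abstract does not give the paper's full optimization problem, so the threshold `lam` and the function names are assumptions for illustration.

```python
def soft_threshold(x, lam):
    """Soft-thresholding: sign(x) * max(|x| - lam, 0)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0  # coefficients within [-lam, lam] are zeroed, which yields sparsity


def soft_threshold_vector(values, lam):
    """Apply soft-thresholding element-wise to a list of coefficients."""
    return [soft_threshold(v, lam) for v in values]


# Hypothetical coefficients: larger lam zeroes more entries.
print(soft_threshold_vector([3.0, -0.5, -2.0, 0.2], 1.0))
```

In a sparse dictionary selection setting, the entries that remain non-zero would indicate which frames are kept as representatives; each application costs one pass over the coefficients.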

References: 0